Networks

This report explores how nouns acquired by 16- to 30-month-old children cluster under word2vec similarity, and how those clusters change with age.

Nouns

We begin by importing data from Wordbank. Age of acquisition (in months, starting at 16) is the age at which 50% of children can produce the word. We also trim naively at this point: homophony and polysemy are ignored (e.g. chicken (food) vs. chicken (animal)), and only the earliest age is kept for each word. These choices will need more care if the analysis is later prepared for publication.

full_vocab <- make_vocab_dataframe(lang="English (American)",
                                   lang_form = "WS",
                                   lex_class = "nouns") %>%
  trim_all_unilemma() %>%
  trim_all_definition() %>%
  arrange(age) %>%            # keep first age for duplicates
  distinct(uni_lemma, .keep_all = TRUE) %>%
  select(uni_lemma, definition, age, category)

The initial vocabulary contains 288 words.
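
As a rough illustration of the 50% criterion and the naive trimming described above, the sketch below works through a small invented table. The `props` data frame, its `prop` column, and all values are assumptions made up for this example; the real pipeline relies on `make_vocab_dataframe()` and the `trim_all_*()` helpers, whose internals are not shown in this report.

```r
library(dplyr)

# Invented production proportions (NOT Wordbank data): one row per item and age.
props <- data.frame(
  uni_lemma  = rep(c("chicken", "chicken", "dog"), each = 3),
  definition = rep(c("chicken (animal)", "chicken (food)", "dog"), each = 3),
  age        = rep(c(20, 22, 24), times = 3),
  prop       = c(0.30, 0.55, 0.70,   # chicken (animal)
                 0.20, 0.40, 0.60,   # chicken (food)
                 0.65, 0.80, 0.90)   # dog
)

# Age of acquisition: the first age at which at least 50% of children
# produce the item.
aoa <- props %>%
  filter(prop >= 0.5) %>%
  group_by(uni_lemma, definition) %>%
  summarise(age = min(age), .groups = "drop")

# Naive trimming: sort by age, then keep the first row per uni_lemma,
# so the earlier-acquired sense stands in for the word.
toy_vocab <- aoa %>%
  arrange(age) %>%
  distinct(uni_lemma, .keep_all = TRUE)

print(toy_vocab)
```

In this toy case the two senses of chicken collapse into a single row carrying the earlier age, which is the simplification the text warns about.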

Word2Vec

Features are taken from McRae et al. (2005), who collected features for 541 nouns from 725 adults, with 30 adults providing 14 features for each noun. Word2vec similarities come from child-directed speech (source: Abdellah).

vocab <- full_vocab %>%
  filter(age <= 30) %>%
  filter_to_w2v()

To begin our analysis, we consider only the words for which word2vec data are available, which leaves 277 words. The 11 missing words are toy, rock, brush, pasta, chips, comb, slide, dress, living room, nail, and green beans.
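
The overlap and the list of missing words can be checked with plain set operations. The sketch below uses two short invented word vectors; how `filter_to_w2v()` actually performs the match is not shown in this report, so this is only an assumption about the general idea.

```r
# Invented word lists (NOT the real vocabulary or word2vec index).
vocab_words <- c("ball", "toy", "dog", "rock")
w2v_words   <- c("ball", "dog", "cat")

# Words that can be analysed: present in both sources.
covered <- intersect(vocab_words, w2v_words)

# Words dropped from the analysis: in the vocabulary but without a vector.
missing <- setdiff(vocab_words, w2v_words)

print(covered)
print(missing)
```

Listing the dropped words in this way makes it easy to see whether a mismatch is a true gap or only a spelling difference between the two sources.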

Build Network

We construct a network with an edge attribute for word2vec similarity.

w2v_links <- make_w2v_links(vocab = vocab) #TODO still problem w/ vocab matching w2v data

network <- graph_from_data_frame(w2v_links, vertices = vocab, directed = FALSE) %>%
  as_tbl_graph() %E>%
  mutate(weight = w2v_similarity) # layout and clustering algorithms automatically use the weight attribute
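
For readers unfamiliar with igraph, the following minimal sketch builds a comparable weighted, undirected graph from a hand-written edge list. The data frames `toy_links` and `toy_nodes` and their values are invented for illustration; the real edge list comes from `make_w2v_links()`, whose output format is assumed here to be a from/to table with a `w2v_similarity` column.

```r
library(igraph)

# Invented similarity edge list (NOT real word2vec values).
toy_links <- data.frame(
  from           = c("dog", "dog",  "cat"),
  to             = c("cat", "ball", "ball"),
  w2v_similarity = c(0.8,   0.2,    0.3)
)

# Vertex table: the first column must hold the vertex names.
toy_nodes <- data.frame(
  name = c("dog", "cat", "ball"),
  age  = c(16, 18, 16)
)

g <- graph_from_data_frame(toy_links, vertices = toy_nodes, directed = FALSE)

# Copy the similarity into the conventional `weight` edge attribute.
E(g)$weight <- E(g)$w2v_similarity

print(is_weighted(g))
```

Keeping both `w2v_similarity` and `weight` on the edges means the original similarity stays available even if the weight is later transformed.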

Analysis

Lay Out the Network

First we generate a display layout for the network, based on the word2vec data.

# Change FALSE to TRUE to recompute and cache the layout; otherwise the cached copy is loaded.
if (FALSE) {
  layout <- layout_network(network)
  saveRDS(layout, file = "layout.rds")
} else {
  layout <- readRDS(file = "layout.rds")
}
  
network <- network %N>%
  mutate(x = layout[,1]) %>%
  mutate(y = layout[,2])

plot_vocab_network(network,
                   frame=FALSE,
                   plot_title=title("Word2Vec Network",cex.main=1),
                   labels="",
                   edge_color = rgb(0,0,0,alpha = (E(network)$weight)^(5/2)))
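
The edge colour above maps each weight to an opacity of weight^(5/2), which fades weak links much faster than strong ones. The sketch below shows that mapping on three invented weights. It assumes the weights lie in [0, 1]; a negative cosine similarity raised to a fractional power would give `NaN`, so such values would need clipping first.

```r
# Invented edge weights, assumed to lie in [0, 1].
w <- c(0.2, 0.5, 0.9)

# Opacity grows steeply with weight, so only strong links stay visible.
alpha <- w^(5 / 2)

# Black edges with per-edge transparency, as hex colour strings.
edge_cols <- rgb(0, 0, 0, alpha = alpha)

print(round(alpha, 3))
print(edge_cols)
```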

Clustering algorithms

Walktrap

par(oma=c(0,0,0.2,0))

for (age_limit in seq(16, 30, by=2)) {

  net <- network %N>% filter(age <= age_limit)
  clusters <- cluster_walktrap(net,
                               weights = E(net)$w2v_similarity)
  
  filename <- str_glue("images/walktrap_network_{age_limit}.png")
  png(filename=filename)
  # Plot the age-filtered subgraph; edge colours must come from the same
  # subgraph so that their number matches the edges being drawn.
  plot_vocab_network(net,
                     clusters = clusters$membership,
                     frame = FALSE,
                     labels = "",
                     edge_color = rgb(0,0,0,alpha = (E(net)$weight)^(5/2)))
  dev.off()
  
  cat("<table style='width:100%'><col width='30%'><col width='70%'><tr>")
  
  cat("<td valign='top'>")

  cat(str_glue('<img src="{filename}">'))
  
  cat("</td><td valign='top'>")
  
  print(kable_clusters(clusters,
                 title=str_glue("Age = {age_limit} Months")))
  cat("</td></tr></table>")
  

}
Age = 16 Months (2 Clusters)
ball balloon book cookie shoe bottle
dog kitty
Note:
Color indicates number of clusters in which a word occurs: 1
Age = 18 Months (4 Clusters)
ball balloon book shoe bottle bird duck car truck apple sock
cookie banana bubbles cheese juice
ear eye nose owie
dog kitty cat
Age = 20 Months (3 Clusters)
dog kitty ball book cookie bird cat duck banana bear bug fish puppy flower bee bunny cow horse pig train doll telephone door rain tree
balloon shoe bottle car truck bubbles apple cheese juice sock airplane boat cracker milk water diaper hat cup keys spoon tv blanket light bed chair
ear eye nose owie foot hair hand mouth toe tooth belly button tummy
Age = 22 Months (3 Clusters)
dog kitty ball balloon book bottle bird cat duck car truck bear bug puppy airplane boat hat cup spoon tv flower bee bunny cow horse pig train doll blanket light telephone bed chair door rain tree mouse bicycle paper pillow potty moon swing frog monkey bus block pen button bowl box clock fork towel watch bathroom bathtub stairs pool
cookie banana bubbles apple cheese juice fish cracker milk water bread cheerios soap chicken cake candy cereal egg grapes ice ice cream orange pizza money
shoe ear eye nose owie sock diaper foot hair hand mouth toe tooth keys belly button tummy drink french fries shirt arm butt finger head toothbrush bib coat pants knee leg
Age = 24 Months (3 Clusters)
cookie banana bubbles apple cheese juice fish milk bread cheerios chicken cake candy cereal egg grapes ice ice cream orange pizza beans butter coffee corn food hamburger pancake peas pickle popcorn raisin sandwich soda soup toast grass carrots meat
dog kitty ball balloon book bottle bird cat duck car truck bear bug puppy airplane boat cracker water cup spoon tv flower bee bunny cow horse pig train doll light telephone chair door rain tree mouse bicycle paper soap potty moon swing frog monkey bus block pen button bowl box clock fork money towel watch bathroom bathtub stairs pool ant butterfly elephant lion teddybear tiger turtle motorcycle stroller crayon broom medicine napkin picture plate kitchen shower table window shovel sky sun animal owl present peanut butter glass tissue star
shoe ear eye nose owie sock diaper hat foot hair hand mouth toe tooth keys belly button tummy blanket bed drink french fries shirt arm butt finger head pillow toothbrush bib coat pants knee leg boots jacket pajamas shorts cheek chin face tongue glasses purse
Age = 26 Months (3 Clusters)
dog kitty ball balloon book cookie bottle bird cat duck car truck bubbles bear bug puppy airplane boat cracker water cup spoon tv flower bee bunny cow horse pig train doll blanket light telephone bed chair door rain tree mouse bicycle paper pillow soap potty moon swing frog monkey bus block pen egg button bowl box clock fork money towel watch bathroom bathtub stairs pool ant butterfly elephant lion teddybear tiger turtle motorcycle stroller crayon broom medicine napkin picture plate purse kitchen shower table window grass shovel sky sun animal owl present peanut butter glass tissue star sheep firetruck helicopter pencil puzzle popsicle zipper knife penny plant tape vacuum bedroom couch crib high chair refrigerator room sink snow street alligator giraffe lamb squirrel zebra bat story pumpkin basket camera garbage hammer radio scissors trash closet garage hose sandbox stick
banana apple cheese juice fish milk bread cheerios chicken cake candy cereal grapes ice ice cream orange pizza beans butter coffee corn food hamburger pancake peas pickle popcorn raisin sandwich soda soup toast carrots meat gum jelly potato strawberry turkey chocolate donut yogurt
shoe ear eye nose owie sock diaper hat foot hair hand mouth toe tooth keys belly button tummy drink french fries shirt arm butt finger head toothbrush bib coat pants knee leg boots jacket pajamas shorts cheek chin face tongue glasses lips belt slipper sweater lawn mower
Age = 28 Months (3 Clusters)
dog kitty ball balloon book cookie bottle bird cat duck car truck bubbles bear bug puppy airplane boat cracker water cup spoon tv flower bee bunny cow horse pig train doll blanket light telephone bed chair door rain tree mouse bicycle paper pillow soap potty moon swing frog monkey bus block pen egg button bowl box clock fork money towel watch bathroom bathtub stairs pool ant butterfly elephant lion teddybear tiger turtle motorcycle stroller crayon pancake broom napkin picture plate purse kitchen shower table window grass shovel sky sun animal owl present peanut butter glass tissue star sheep firetruck helicopter pencil puzzle popsicle zipper knife penny plant tape vacuum bedroom couch crib high chair refrigerator room sink snow street alligator giraffe lamb squirrel zebra bat story pumpkin basket camera garbage hammer radio scissors trash closet garage hose sandbox stick tractor game necklace penis bucket can dish drawer oven rocking chair stove yard cloud ladder wind lamp
banana apple cheese juice fish milk bread cheerios chicken cake candy cereal grapes ice ice cream orange pizza beans butter coffee corn food hamburger peas pickle popcorn raisin sandwich soda soup toast carrots meat gum jelly potato strawberry turkey chocolate donut yogurt applesauce jello lollipop muffin nuts vitamins pudding
shoe ear eye nose owie sock diaper hat foot hair hand mouth toe tooth keys belly button tummy drink french fries shirt arm butt finger head toothbrush bib coat pants knee leg boots jacket pajamas shorts cheek chin face tongue glasses medicine lips belt slipper sweater lawn mower underpants
Age = 30 Months (3 Clusters)
cookie banana apple cheese juice fish cracker milk bread cheerios chicken cake candy cereal egg grapes ice ice cream orange pizza beans butter coffee corn food hamburger pancake peas pickle popcorn raisin sandwich soda soup toast carrots meat gum jelly potato strawberry turkey chocolate donut yogurt applesauce jello lollipop muffin nuts vitamins pudding play dough coke melon salt sauce tuna pretzel
dog kitty ball balloon book bottle bird cat duck car truck bubbles bear bug puppy airplane boat water cup spoon tv flower bee bunny cow horse pig train doll blanket light telephone bed chair door rain tree mouse bicycle paper pillow soap potty moon swing frog monkey bus block pen button bowl box clock fork money towel watch bathroom bathtub stairs pool ant butterfly elephant lion teddybear tiger turtle motorcycle stroller crayon broom medicine napkin picture plate purse kitchen shower table window grass shovel sky sun animal owl present peanut butter glass tissue star sheep firetruck helicopter pencil puzzle popsicle zipper knife penny plant tape vacuum bedroom couch crib high chair refrigerator room sink snow street alligator giraffe lamb squirrel zebra bat story pumpkin basket camera garbage hammer radio scissors trash closet garage hose sandbox stick tractor game necklace penis bucket can dish drawer oven rocking chair stove yard cloud ladder wind lamp dryer flag snowman deer goose penguin pony rooster wolf tricycle chalk glue mop washing machine garden roof sidewalk sprinkler
shoe ear eye nose owie sock diaper hat foot hair hand mouth toe tooth keys belly button tummy drink french fries shirt arm butt finger head toothbrush bib coat pants knee leg boots jacket pajamas shorts cheek chin face tongue glasses lips belt slipper sweater lawn mower underpants shoulder gloves jeans mittens sneaker
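
To make the clustering step easier to follow in isolation, here is a minimal sketch of weighted walktrap clustering on a toy graph. The words and similarity values are invented, and the graph is deliberately built as two tight triads joined by one weak link; it is not a subset of the real network.

```r
library(igraph)

# Invented similarities: two tight triads joined by a single weak edge.
toy_links <- data.frame(
  from = c("dog", "dog",   "cat",   "apple",  "apple", "banana", "puppy"),
  to   = c("cat", "puppy", "puppy", "banana", "juice", "juice",  "juice"),
  w2v_similarity = c(0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.1)
)

toy_net <- graph_from_data_frame(toy_links, directed = FALSE)

# Short random walks tend to stay inside densely connected groups;
# higher weights make the corresponding steps more likely.
toy_clusters <- cluster_walktrap(toy_net,
                                 weights = E(toy_net)$w2v_similarity)

# One character vector of words per detected cluster.
print(split(V(toy_net)$name, toy_clusters$membership))
```

Each word receives exactly one cluster label, which is why every table above reports that words occur in a single cluster.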

River Plot

subgraphs_by_age <- get_subgraphs_by_age(network, cluster_walktrap)
#out.width='100%', fig.asp=1, dpi=300
river <- cluster_river(subgraphs_by_age = subgraphs_by_age)

plot(river, gravity = "center", plot_area = 0.8)

suppressWarnings(precursors_graph(subgraphs_by_age, 0)) %>% plot_river_network()
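
The idea behind a river plot is to follow how the words of a cluster at one age are distributed over the clusters at the next age. The sketch below shows that bookkeeping with a simple cross-tabulation of two invented membership vectors. It is only an assumption about the underlying idea; the internals of `cluster_river()` and `precursors_graph()` are not shown in this report.

```r
# Invented cluster memberships at two ages (names are words, values are cluster ids).
m20 <- c(dog = 1, cat = 1, ball = 2, cup = 2)
m22 <- c(dog = 1, cat = 1, ball = 1, cup = 2, spoon = 2)

# Only words present at both ages can flow from one cluster to another.
shared <- intersect(names(m20), names(m22))

# Rows: cluster at 20 months; columns: cluster at 22 months;
# cells: number of shared words moving between the two.
flows <- table(age_20 = m20[shared], age_22 = m22[shared])

print(flows)
```

Words that first appear at the later age, such as spoon here, have no precursor cluster and would enter the river as new material.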